Autonomous driving is a key technology on the path toward a brighter, more sustainable future. To realize this future, autonomous vehicles need to be deployed within shared mobility models. However, assessing whether two or more route requests allow for a shared ride is a computationally intensive task if done by reprocessing the routes. In this work, we propose a dynamic Longest Common Subsequence algorithm that enables a fast and cost-efficient compatibility comparison of two routes and dynamically merges the portions of the routes that are suitable for a shared trip. Based on this, it is also possible to estimate how many autonomous vehicles might be needed to satisfy local mobility demand. This can help providers estimate the necessary fleet size and help policy makers better understand mobility patterns in their cities in order to expand the necessary infrastructure.
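To make the idea concrete, here is a minimal sketch of an LCS-based compatibility check between two route requests, assuming routes are given as ordered sequences of stop or road-segment identifiers. It uses the textbook dynamic-programming LCS rather than the paper's dynamic variant, and the compatibility score at the end is only an illustrative choice.

```python
def lcs_length(route_a, route_b):
    """Classic dynamic-programming LCS over two routes given as stop sequences.
    The paper's dynamic LCS additionally merges shareable sub-routes on the fly;
    this sketch only measures how much of the two routes overlaps in order."""
    m, n = len(route_a), len(route_b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if route_a[i - 1] == route_b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

# Hypothetical route requests as sequences of road-segment ids.
r1 = ["A", "B", "C", "D", "E", "F"]
r2 = ["X", "B", "C", "D", "Y", "F"]
overlap = lcs_length(r1, r2) / min(len(r1), len(r2))  # simple compatibility score
print(f"shared fraction: {overlap:.2f}")
```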
We consider the problem of dynamic pricing of a product in the presence of feature-dependent price sensitivity. Developing practical algorithms that can estimate price elasticities robustly, especially when information about no purchases (losses) is not available, to drive such automated pricing systems is a challenge faced by many industries. Based on the Poisson semi-parametric approach, we construct a flexible yet interpretable demand model where the price-related part is parametric while the remaining (nuisance) part of the model is non-parametric and can be modeled via sophisticated machine learning (ML) techniques. Estimating the price-sensitivity parameters of this model via direct one-stage regression techniques may lead to biased estimates due to regularization. To address this concern, we propose a two-stage estimation methodology which makes the estimation of the price-sensitivity parameters robust to biases in the estimators of the nuisance parameters of the model. In the first stage we construct estimators of observed purchases and prices given the feature vector using sophisticated ML estimators such as deep neural networks. Utilizing the estimators from the first stage, in the second stage we leverage a Bayesian dynamic generalized linear model to estimate the price-sensitivity parameters. We test the performance of the proposed estimation schemes on simulated and real sales transaction data from the airline industry. Our numerical studies demonstrate that our proposed two-stage approach reduces the estimation error in price-sensitivity parameters from 25\% to 4\% in realistic simulation settings. The two-stage estimation techniques proposed in this work allow practitioners to leverage modern ML techniques to robustly estimate price sensitivities while maintaining interpretability and keeping the various constituent parts of the model easy to validate.
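The following sketch illustrates the two-stage, residual-on-residual idea on simulated data. It is a simplification under stated assumptions: a plain linear second stage stands in for the Bayesian dynamic GLM, the Poisson structure is replaced by an additive demand model, cross-fitting is omitted, and all variable names and the data-generating process are invented for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n, beta_true = 5000, -1.5                        # beta_true: price sensitivity to recover
X = rng.normal(size=(n, 5))                      # customer / context features
price = 0.5 * X[:, 0] + rng.normal(scale=0.3, size=n)   # prices depend on features (confounding)
demand = np.sin(X[:, 1]) + X[:, 2] ** 2 + beta_true * price + rng.normal(size=n)

# Stage 1: flexible ML estimates of E[demand | X] and E[price | X].
d_hat = GradientBoostingRegressor().fit(X, demand).predict(X)
p_hat = GradientBoostingRegressor().fit(X, price).predict(X)

# Stage 2: regress demand residuals on price residuals to isolate the price effect.
# (The paper uses a Bayesian dynamic GLM here; least squares is only a stand-in.)
d_res, p_res = demand - d_hat, price - p_hat
beta_hat = (p_res @ d_res) / (p_res @ p_res)
print(f"true {beta_true}, estimated {beta_hat:.2f}")
```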
In the foreseeable future, autonomous vehicles will require human assistance in situations they cannot resolve on their own. In such cases, remote assistance from a human can provide the vehicle with the inputs it needs to continue its operation. Typical sensors used in autonomous vehicles include cameras and lidar sensors. Due to the large amount of sensor data that must be transmitted in real time, efficient data compression is essential to avoid overloading the network infrastructure. Sensor data compression using deep generative neural networks has been shown to outperform traditional compression methods for image and lidar data with respect to compression rate as well as reconstruction quality. However, there is a lack of research on the performance of compression algorithms based on generative neural networks for remote assistance. To gain insight into the feasibility of using deep generative models for remote assistance, we evaluate state-of-the-art algorithms regarding their applicability and identify potential weaknesses. In addition, we implement an online pipeline for processing sensor data and demonstrate its performance for remote assistance using the CARLA simulator.
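As a rough illustration of the kind of learned compression step such an online pipeline needs, the sketch below encodes a camera frame into a compact latent on the vehicle side and reconstructs it on the operator side. The toy autoencoder, its layer sizes, and the input resolution are assumptions for illustration; it is not one of the evaluated state-of-the-art generative compression models, which additionally involve entropy coding and lidar-specific architectures.

```python
import torch
import torch.nn as nn

class TinyCodec(nn.Module):
    """Toy learned codec: encode on the vehicle, transmit the latent, decode operator-side."""
    def __init__(self, channels=3, bottleneck=16):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(channels, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, bottleneck, 4, stride=2, padding=1))
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(bottleneck, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, channels, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x):
        code = self.enc(x)                  # compact latent sent over the network
        return self.dec(code), code

frame = torch.rand(1, 3, 128, 128)          # placeholder camera frame
recon, code = TinyCodec()(frame)
print(frame.numel(), "->", code.numel(), "latent values")
```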
The analysis of network structure is essential to many scientific areas, ranging from biology to sociology. As the computational task of clustering these networks into partitions, i.e., solving the community detection problem, is generally NP-hard, heuristic solutions are indispensable. The exploration of expedient heuristics has led to the development of particularly promising approaches in the emerging technology of quantum computing. Motivated by the substantial hardware demands of all established quantum community detection approaches, we introduce a novel QUBO-based approach that needs only as many qubits as the graph has nodes and is represented by a QUBO matrix as sparse as the input graph's adjacency matrix. The substantial improvement in the sparsity of the QUBO matrix, which is typically very dense in related work, is achieved through the novel concept of separation nodes. Instead of assigning every node to a community directly, this approach relies on the identification of a separation-node set which, upon its removal from the graph, yields a set of connected components representing the core components of the communities. Employing a greedy heuristic to assign the nodes from the separation-node set to the identified community cores, subsequent experimental results provide a proof of concept. This work hence presents a promising approach to NISQ-ready quantum community detection, catalyzing the application of quantum computers to the network structure analysis of large-scale, real-world problem instances.
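A small sketch of the post-processing step may help: given a separation-node set (here simply assumed; in the paper it is obtained by solving the sparse QUBO), removing it from the graph yields the community cores, and a greedy heuristic attaches each separation node to the core it shares the most edges with. The example graph and the chosen node set are placeholders, not results from the paper.

```python
import networkx as nx

# Toy graph and an assumed separation-node set (the paper obtains this set
# from the QUBO solution; the choice below is only a placeholder).
G = nx.karate_club_graph()
separation_nodes = {0, 2, 31, 33}

# Removing the separation nodes yields the community cores.
core_graph = G.subgraph(set(G) - separation_nodes)
cores = [set(c) for c in nx.connected_components(core_graph)]

# Greedy heuristic: attach each separation node to the core it shares most edges with.
for v in separation_nodes:
    best = max(range(len(cores)),
               key=lambda k: sum(1 for u in G[v] if u in cores[k]))
    cores[best].add(v)

print([sorted(c) for c in cores])
```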
Efficient surrogate modelling is a key requirement for uncertainty quantification in data-driven scenarios. In this work, a novel approach of using Sparse Random Features for surrogate modelling in combination with self-supervised dimensionality reduction is described. The method is compared to other methods on synthetic and real data obtained from crashworthiness analyses. The results show that the approach described here outperforms state-of-the-art surrogate modelling techniques such as Polynomial Chaos Expansions and Neural Networks.
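As a point of reference for how a random-features surrogate works, here is a minimal sketch that fits random Fourier features with an L1-penalized linear model to a cheap toy function. Interpreting "sparse" as an L1 fit is an assumption, as are the toy simulator, feature count, and hyperparameters; the self-supervised dimensionality-reduction step is omitted.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)

# Toy surrogate task: approximate a cheap stand-in for an expensive simulator.
def simulator(x):
    return np.sin(3 * x[:, 0]) + 0.5 * x[:, 1] ** 2

X = rng.uniform(-1, 1, size=(200, 2))
y = simulator(X)

# Random Fourier features; sparsity comes from the L1 penalty on the coefficients.
n_features = 300
W = rng.normal(scale=2.0, size=(X.shape[1], n_features))
b = rng.uniform(0, 2 * np.pi, n_features)
features = lambda data: np.cos(data @ W + b)

surrogate = Lasso(alpha=1e-3, max_iter=50_000).fit(features(X), y)
X_test = rng.uniform(-1, 1, size=(5, 2))
print(surrogate.predict(features(X_test)))
print(simulator(X_test))
```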
In the era of noisy intermediate-scale quantum devices, variational quantum circuits (VQCs) are currently one of the main strategies for building quantum machine learning models. These models are made up of a quantum part and a classical part. The quantum part is given by a parametrization $U$, which, in general, is obtained from the product of different quantum gates. In turn, the classical part corresponds to an optimizer that updates the parameters of $U$ in order to minimize a cost function $C$. However, despite the many applications of VQCs, there are still questions to be answered, such as: What is the best sequence of gates to use? How should their parameters be optimized? Which cost function should be used? How does the architecture of the quantum chips influence the final results? In this article, we focus on answering the last question. We show that, in general, the cost function tends to a typical average value the closer the parametrization used is to a $2$-design. Therefore, the closer this parametrization is to a $2$-design, the less the result of the quantum neural network model depends on its parametrization. As a consequence, we can use the architecture of the quantum chips themselves to define the VQC parametrization, avoiding the use of additional SWAP gates and thus reducing the VQC depth and the associated errors.
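The numerical sketch below illustrates this concentration effect on a toy hardware-efficient ansatz whose entangling layer follows an assumed linear chip connectivity (CNOTs only between neighbouring qubits): as the circuit gets deeper and its induced distribution approaches a 2-design, the spread of the cost over random parameters shrinks toward a typical value. Gate choice, qubit count, and observable are illustrative assumptions, not the article's setup.

```python
import numpy as np

n_qubits = 3
I2 = np.eye(2)
Z = np.diag([1.0, -1.0])

def ry(t):  # single-qubit rotation used in the toy ansatz
    return np.array([[np.cos(t / 2), -np.sin(t / 2)],
                     [np.sin(t / 2),  np.cos(t / 2)]])

def kron_all(mats):
    out = np.array([[1.0]])
    for m in mats:
        out = np.kron(out, m)
    return out

def cnot(control, target, n):  # CNOT as a permutation matrix on n qubits
    dim = 2 ** n
    U = np.zeros((dim, dim))
    for i in range(dim):
        bits = [(i >> (n - 1 - q)) & 1 for q in range(n)]
        if bits[control]:
            bits[target] ^= 1
        j = sum(b << (n - 1 - q) for q, b in enumerate(bits))
        U[j, i] = 1.0
    return U

def ansatz(params):  # params shape: (layers, n_qubits)
    U = np.eye(2 ** n_qubits)
    for layer in params:
        U = kron_all([ry(t) for t in layer]) @ U
        for q in range(n_qubits - 1):  # entangling gates along the assumed chain connectivity
            U = cnot(q, q + 1, n_qubits) @ U
    return U

def cost(params):  # <0| U^dag (Z on qubit 0) U |0>
    observable = kron_all([Z] + [I2] * (n_qubits - 1))
    psi = ansatz(params)[:, 0]
    return float(psi @ observable @ psi)

rng = np.random.default_rng(0)
for layers in (1, 4, 16):
    samples = [cost(rng.uniform(0, 2 * np.pi, (layers, n_qubits))) for _ in range(200)]
    # the variance of the cost over random parameters decreases toward its 2-design value
    print(layers, np.var(samples))
```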
Recent trends in language modeling have focused on increasing performance through scaling, and have resulted in an environment where training language models is out of reach for most researchers and practitioners. While most in the community are asking how to push the limits of extreme computation, we ask the opposite question: How far can we get with a single GPU in just one day? We investigate the downstream performance achievable with a transformer-based language model trained completely from scratch with masked language modeling for a single day on a single consumer GPU. Aside from re-analyzing nearly all components of the pretraining pipeline for this scenario and providing a modified pipeline with performance close to BERT, we investigate why scaling down is hard, and which modifications actually improve performance in this scenario. We provide evidence that even in this constrained setting, performance closely follows scaling laws observed in large-compute settings. Through the lens of scaling laws, we categorize a range of recent improvements to training and architecture and discuss their merit and practical applicability (or lack thereof) for the limited compute setting.
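Since the study hinges on the masked-language-modeling objective, a brief sketch of the standard BERT-style masking step (15% of tokens become targets; of those, 80% are replaced by the mask token, 10% by a random token, 10% kept) may be useful. This follows the usual recipe, not any of the paper's specific pipeline modifications; the token ids and vocabulary size are placeholders.

```python
import torch

def mask_tokens(input_ids, mask_token_id, vocab_size, mlm_prob=0.15):
    """Standard BERT-style masking: 15% of positions become prediction targets;
    of those, 80% -> [MASK], 10% -> random token, 10% -> left unchanged."""
    labels = input_ids.clone()
    target = torch.bernoulli(torch.full(input_ids.shape, mlm_prob)).bool()
    labels[~target] = -100  # positions ignored by the cross-entropy loss
    replace = torch.bernoulli(torch.full(input_ids.shape, 0.8)).bool() & target
    input_ids[replace] = mask_token_id
    randomize = torch.bernoulli(torch.full(input_ids.shape, 0.5)).bool() & target & ~replace
    input_ids[randomize] = torch.randint(vocab_size, input_ids.shape)[randomize]
    return input_ids, labels

ids = torch.randint(5, 30000, (2, 16))  # placeholder batch of token ids
inputs, labels = mask_tokens(ids.clone(), mask_token_id=103, vocab_size=30000)
```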
This short paper discusses continually updated causal abstractions as a potential direction of future research. The key idea is to revise the existing level of causal abstraction to a different level of detail that is both consistent with the history of observed data and more effective in solving a given task.
State-of-the-art poetry generation systems are often complex. They either consist of task-specific model pipelines, incorporate prior knowledge in the form of manually created constraints or both. In contrast, end-to-end models would not suffer from the overhead of having to model prior knowledge and could learn the nuances of poetry from data alone, reducing the degree of human supervision required. In this work, we investigate end-to-end poetry generation conditioned on styles such as rhyme, meter, and alliteration. We identify and address lack of training data and mismatching tokenization algorithms as possible limitations of past attempts. In particular, we successfully pre-train and release ByGPT5, a new token-free decoder-only language model, and fine-tune it on a large custom corpus of English and German quatrains annotated with our styles. We show that ByGPT5 outperforms other models such as mT5, ByT5, GPT-2 and ChatGPT, while also being more parameter efficient and performing favorably compared to humans. In addition, we analyze its runtime performance and introspect the model's understanding of style conditions. We make our code, models, and datasets publicly available.
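As a small illustration of what "token-free" means for style-conditioned inputs, the sketch below encodes a prompt byte by byte, the way ByT5/ByGPT5-style models consume text. The style-tag format shown is a hypothetical illustration, not the paper's annotation scheme.

```python
# Byte-level ("token-free") encoding of a style-conditioned prompt.
# The tag format below is a hypothetical illustration, not the paper's scheme.
style_tags = "rhyme=ABAB meter=iambic alliteration=high"
prompt = f"[{style_tags}]\n"
byte_ids = list(prompt.encode("utf-8"))  # one id per byte, vocabulary of at most 256 symbols
print(len(byte_ids), byte_ids[:12])
```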
Overfitting is a problem in Convolutional Neural Networks (CNNs) that causes poor generalization of models on unseen data. To remedy this problem, many new and diverse data augmentation (DA) methods have been proposed to supplement or generate more training data and thereby improve its quality. In this work, we propose a new data augmentation algorithm: VoronoiPatches (VP). We primarily utilize non-linear recombination of information within an image, fragmenting and occluding small information patches. Unlike other DA methods, VP uses small convex polygon-shaped patches in a random layout to transport information around within an image. Sudden transitions created between patches and the original image can, optionally, be smoothed. In our experiments, VP outperformed current DA methods with regard to model variance and overfitting tendencies. We demonstrate that data augmentation utilizing non-linear recombination of information within images, together with non-orthogonal shapes and structures, improves the robustness of CNN models on unseen data.
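A rough re-implementation sketch of the core VP operation may help, assuming the patches come from a nearest-seed (Voronoi) partition of the pixel grid and the content of a few cells is transported to shifted positions. The optional smoothing of patch boundaries is omitted, and all parameter choices below are assumptions rather than the authors' settings.

```python
import numpy as np

def voronoi_patches(img, n_seeds=20, n_move=5, rng=None):
    """Sketch of a VoronoiPatches-style augmentation: partition the image into
    Voronoi cells and transport the content of a few cells to shifted positions.
    This is an illustrative approximation, not the authors' exact procedure."""
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    seeds = rng.integers(0, [h, w], size=(n_seeds, 2))
    # Nearest-seed assignment yields convex, polygon-shaped cells.
    dist = (ys[..., None] - seeds[:, 0]) ** 2 + (xs[..., None] - seeds[:, 1]) ** 2
    cell = dist.argmin(-1)
    out = img.copy()
    for c in rng.choice(n_seeds, size=n_move, replace=False):
        mask = cell == c
        dy = rng.integers(-h // 4, h // 4)
        dx = rng.integers(-w // 4, w // 4)
        src_y = np.clip(ys[mask] + dy, 0, h - 1)
        src_x = np.clip(xs[mask] + dx, 0, w - 1)
        out[ys[mask], xs[mask]] = img[src_y, src_x]  # transport patch content
    return out

augmented = voronoi_patches(np.random.rand(64, 64, 3))
```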